Effects of word and character frequencies on Chinese character writing

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

SUBTLEX-CH: Chinese Word and Character Frequencies Based on Film Subtitles

BACKGROUND Word frequency is the most important variable in language research. However, despite the growing interest in the Chinese language, there are only a few sources of word frequency measures available to researchers, and the quality is less than what researchers in other languages are used to. METHODOLOGY Following recent work by New, Brysbaert, and colleagues in English, French and Du...

متن کامل

Learning Character Representations for Chinese Word Segmentation

We propose a simple yet effective semi-supervised method for improving Chinese Word Segmentation. Our method is based on learning generalizable vector and cluster representations of variable-length character sequences from large unlabeled data, which is then incorporated into a sequence labeling model with the passive-aggressive algorithm as features. We achieve state-of-the-art results on the ...

متن کامل

Chinese Word Segmentation as Character Tagging

In this paper we report results of a supervised machine-learning approach to Chinese word segmentation. A maximum entropy tagger is trained on manually annotated data to automatically assign to Chinese characters, or hanzi, tags that indicate the position of a hanzi within a word. The tagged output is then converted into segmented text for evaluation. Preliminary results show that this approach...

متن کامل

Off-line Character Recognition using On-line Character Writing Information

Recognition of variously deformed character patterns is a salient subject for off-line hand-printed character recognition. Sufficient recognition performance for practical use has not been achieved despite reports of many recognition techniques. Our research examines effective recognition techniques for deformed characters, extending conventional recognition techniques using an on-line characte...

متن کامل

Pragmatic Chinese Lexical Analysis Based on Word-character Hybrid Model

In the field of information and natural language processing, Chinese lexical analysis is important basic step for Chinese, Japanese or other asian language. This paper presents Chinese lexical analysis integrating word-level and character-level information based on hybrid model combining word-based CRF model and latent semi-CRF model. The word-lattice, which represents all candidate outputs, is...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Frontiers in Human Neuroscience

سال: 2017

ISSN: 1662-5161

DOI: 10.3389/conf.fnhum.2017.223.00054